Data Model in Python

All data in a Python program is represented by objects or by relations between objects.

Every object in Python has a type, a value and an identity. An object's type determines its supported operations as well as the possible values it can take.

In some cases, an object's value can change. We call such objects mutable. Objects whose values cannot change once they are created are called immutable. An object's type determines its mutability. Numbers and strings, for example, are immutable; lists and dictionaries are mutable.
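As a minimal sketch of the distinction, the snippet below tries the same item assignment on a string and on a list. The variable names (greeting, numbers, string_changed) are illustrative only and are not used elsewhere in this notebook.

```python
# Strings are immutable: item assignment raises TypeError.
greeting = "hi"
try:
    greeting[0] = "H"
    string_changed = True
except TypeError:
    string_changed = False

# Lists are mutable: the same kind of assignment succeeds in place.
numbers = [1, 2, 3]
numbers[0] = 10

print(string_changed)  # False
print(numbers)         # [10, 2, 3]
```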

To make this clear, let's first describe object identity. It can be thought of as the object's address in memory; in CPython, the built-in id() function returns exactly that address. Once an object is created, its identity never changes.


In [6]:
x = "hi"
hex(id(x))


Out[6]:
'0x10d7360a0'

The identity (memory address) of the object that x refers to is displayed. Note that the memory addresses will change every time the code is run.

What happens if we create a new variable y and set it equal to x?


In [7]:
y = x

In [8]:
hex(id(y))


Out[8]:
'0x10d7360a0'

In [9]:
hex(id(x))


Out[9]:
'0x10d7360a0'

The address in memory is the same because both names point to (or reference) the same object.
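Comparing hex strings by eye works, but Python also has the `is` operator for this kind of identity check. A small sketch, where same_object and same_id are illustrative names:

```python
x = "hi"
y = x  # binds the name y to the object x already refers to

same_object = x is y       # identity comparison
same_id = id(x) == id(y)   # the equivalent check through id()

print(same_object, same_id)  # True True
```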

Now let's give x some other value.


In [10]:
x = "Hello"
hex(id(x))


Out[10]:
'0x10f24f7d8'

In [11]:
hex(id(y))


Out[11]:
'0x10d7360a0'

Now x has a different address, while y still has the original one. Let's see what happens if we set x equal to 'hi' once more.


In [12]:
x = 'hi'
hex(id(x))


Out[12]:
'0x10d7360a0'

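x is once again pointing to the memory address associated with 'hi'. The object survived because y still references it, and CPython reuses (interns) short string literals like this one; that reuse is an implementation detail rather than a language guarantee.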
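The reuse of one address for equal strings is something the interpreter may or may not do on its own. The hedged sketch below builds an equal string at run time and uses sys.intern() from the standard library, which returns one canonical object per string value. The names literal, built, canonical_a and canonical_b are illustrative.

```python
import sys

literal = "hi"
built = "".join(["h", "i"])  # equal text, assembled at run time

print(literal == built)  # True: the values are equal

# Whether `literal is built` holds depends on the interpreter, so it is
# not checked here. sys.intern() returns the canonical interned object
# for a string value, so both calls below return the same object.
canonical_a = sys.intern(literal)
canonical_b = sys.intern(built)
print(canonical_a is canonical_b)  # True
```

The practical takeaway is to compare strings with ==, and to reserve identity checks for cases where identity is what you actually mean.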

What does this have to do with mutability? It seems as though we were able to change x's value. In fact, we never changed the string 'hi'; we only made the name x refer to a different object. To see the difference, we will look at a mutable object, a list for example.


In [13]:
a = [1,2,3]
hex(id(a))


Out[13]:
'0x10f1dc448'

In [14]:
a.append(4)
hex(id(a))


Out[14]:
'0x10f1dc448'

Notice what happened. We added 4 to the list, but the memory address did not change. This is what it means to be mutable. The object at address '0x10f1dc448' originally held [1,2,3] and now holds [1,2,3,4]. The address of this object will never change.
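To contrast in-place mutation with rebinding, here is a short sketch that keeps a second reference to each original object and then applies += to a string and to a list. The names text, text_before, items and items_before are illustrative and not part of the notebook session above.

```python
text = "hi"
text_before = text      # second reference to the original string
text += "!"             # builds a new str and rebinds the name

items = [1, 2, 3]
items_before = items    # second reference to the same list
items += [4]            # extends the existing list in place

print(text is text_before)    # False
print(text_before)            # hi
print(items is items_before)  # True
print(items_before)           # [1, 2, 3, 4]

items = items + [5]           # builds a new list and rebinds the name
print(items is items_before)  # False
print(items_before)           # [1, 2, 3, 4]
```

Note that `items += [4]` and `items = items + [5]` look similar but behave differently: the first mutates the list, the second creates a new one.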


In [15]:
a.append("#python")
a


Out[15]:
[1, 2, 3, 4, '#python']

In [16]:
hex(id(a))


Out[16]:
'0x10f1dc448'

Now let's see what happens when we assign our list 'a' to a new variable 'b'.


In [17]:
b = a

In [18]:
b


Out[18]:
[1, 2, 3, 4, '#python']

In [19]:
hex(id(b))


Out[19]:
'0x10f1dc448'

That makes sense: 'a' and 'b' both reference the same object, [1,2,3,4,'#python'].

Assignment statements in Python do not copy objects; they create bindings between a target and an object.

If we modify b, what will happen to a?


In [20]:
b[-1] = 'Python'

In [21]:
b


Out[21]:
[1, 2, 3, 4, 'Python']

In [22]:
a


Out[22]:
[1, 2, 3, 4, 'Python']

In [23]:
hex(id(a)) == hex(id(b))


Out[23]:
True

The changes made through 'b' are visible through 'a' because both names point to the same object. Sometimes we may not want this behavior. As a solution, we can make a copy of the object so that modifying one does not affect the other. To do so we can use the 'copy' module from the standard library.


In [24]:
import copy
c = copy.copy(a)

In [26]:
hex(id(a)) == hex(id(c))


Out[26]:
False

This is referred to as making a shallow copy. While the values of 'a' and 'c' are the same, their respective memory addresses are different.
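For lists specifically, copy.copy() is not the only way to get a shallow copy. The sketch below lists a few equivalent spellings; the list name copies is illustrative, and `a` is re-created here so the block runs on its own.

```python
import copy

a = [1, 2, 3, 4, "Python"]

# Four ways to build a shallow copy of a list.
copies = [copy.copy(a), a.copy(), list(a), a[:]]

print(all(c == a for c in copies))  # True: equal values
print(any(c is a for c in copies))  # False: each one is a new list object
```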

A shallow copy creates a new container (a list in this case), which is why the addresses in memory are different, and fills it with references to the contents of the original object.


In [27]:
hex(id(a[-1]))


Out[27]:
'0x10d648b90'

In [28]:
hex(id(c[-1]))


Out[28]:
'0x10d648b90'

The addresses in memory for the individual elements are the same for both lists. Because the lists themselves are separate objects, though, we can now replace an element in one list without affecting the other.


In [29]:
c[-1] = 'PYTHON'
c


Out[29]:
[1, 2, 3, 4, 'PYTHON']

In [30]:
a


Out[30]:
[1, 2, 3, 4, 'Python']
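The shallow-copy behavior shown in the cells above can be condensed into one self-contained sketch that uses `is` instead of comparing addresses by eye. The name shares_elements is illustrative.

```python
import copy

a = [1, 2, 3, 4, "Python"]
c = copy.copy(a)

# The containers differ, but every slot refers to the same element object.
shares_elements = all(x is y for x, y in zip(a, c))
print(c is a)           # False
print(shares_elements)  # True

c[-1] = "PYTHON"  # rebinds one slot in c; the slot in a still holds "Python"
print(a[-1], c[-1])  # Python PYTHON
```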

Shallow copies work well for flat containers, but what if we are dealing with nested mutable objects? For this we will use a dictionary as an example.


In [31]:
d0 = {'Key':{'nested':'string'}}
d1 = copy.copy(d0)
d1


Out[31]:
{'Key': {'nested': 'string'}}

In [32]:
d1['Key']['nested'] = 'dict'
d0 == d1


Out[32]:
True

In [33]:
d0


Out[33]:
{'Key': {'nested': 'dict'}}

Our intention was to change only d1, but d0 was also changed. This is because a shallow copy references the contents of the original; it does not copy them. To get a fully independent copy, we need to use the deepcopy() function from the 'copy' module.


In [34]:
d0 = {'Key':{'nested':'string'}}
d1 = copy.deepcopy(d0)
d1


Out[34]:
{'Key': {'nested': 'string'}}

In [35]:
d1['Key']['nested'] = 'dict'
d0 == d1


Out[35]:
False

In [36]:
d1


Out[36]:
{'Key': {'nested': 'dict'}}

In [37]:
d0


Out[37]:
{'Key': {'nested': 'string'}}
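As a closing sketch, the two kinds of copy can be compared side by side by checking the identity of the nested dictionary. The names shallow and deep are illustrative; d0 is re-created so the block runs on its own.

```python
import copy

d0 = {"Key": {"nested": "string"}}
shallow = copy.copy(d0)
deep = copy.deepcopy(d0)

# A shallow copy shares the nested dict; a deep copy gets its own.
print(shallow["Key"] is d0["Key"])  # True
print(deep["Key"] is d0["Key"])     # False

deep["Key"]["nested"] = "dict"
print(d0)  # {'Key': {'nested': 'string'}}
```

A deep copy walks the whole structure, so it costs more time and memory than a shallow copy; it is worth reaching for when nested mutable objects must stay independent.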